1,782 research outputs found

    Identifying variants that contribute to linkage for dichotomous and quantitative traits in extended pedigrees

    Compared to genome-wide association analysis, linkage analysis is less influenced by allelic heterogeneity. The use of linkage information in large families should provide a great opportunity to identify less frequent variants. We performed a linkage scan for both dichotomous and quantitative traits in eight extended families. For the dichotomous trait, we identified one linkage region on chromosome 4q. For quantitative traits, we identified two regions on chromosomes 4q and 6p for Q1 and one region on chromosome 6q for Q2. To identify variants that contribute to these linkage signals, we performed standard association analysis in genomic regions of interest. We also screened less frequent variants in the linkage region based on the risk ratio and phenotypic distribution among carriers. Two rare variants at VEGFC and one common variant on chromosome 4q conferred the greatest risk for the dichotomous trait. We identified two rare variants on chromosomes 4q (VEGFC) and 6p (VEGFA) that explain 12.4% of the total phenotypic variance of trait Q1. We also identified four variants (including one at VNN3) on chromosome 6q that are able to drop the linkage LOD from 3.7 to 1.0. These results suggest that the use of classical linkage and association methods in large families can provide a useful approach to identifying variants that are responsible for diseases and complex traits in families.

    Handling linkage disequilibrium in qualitative trait linkage analysis using dense SNPs: a two-step strategy

    Background: In affected sibling pair linkage analysis, the presence of linkage disequilibrium (LD) has been shown to lead to overestimation of the number of alleles shared identity-by-descent (IBD) among sibling pairs when parents are ungenotyped. This inflation results in spurious evidence for linkage even when the markers and the disease locus are not linked. In our study, we first theoretically evaluate how inflation in IBD probabilities leads to overestimation of a nonparametric linkage (NPL) statistic under the assumption of linkage equilibrium. Next, we propose a two-step processing strategy in order to systematically evaluate approaches to handle LD. Based on the observed inflation of expected logarithm of the odds ratio (LOD) from our theoretical exploration, we implemented our proposed two-step processing strategy. Step 1 involves three techniques to filter a dense set of markers. In step 2, we use the selected subset of markers from step 1 and apply four different methods of handling LD among dense markers: 1) marker thinning (MT); 2) recursive elimination (RE); 3) SNPLINK; and 4) the LD modeling approach in MERLIN. We evaluate relative performance of each method through simulation. Results: We observed LOD score inflation only when the parents were ungenotyped. For a given number of markers, all approaches evaluated for each type of LD threshold performed similarly; however, the RE approach was the only one that eliminated the LOD score bias. Our simulation results indicate a reduction of approximately 75% to complete elimination of the LOD score inflation while maintaining the information content (IC) when setting a tolerable squared correlation coefficient LD threshold (r²) above 0.3 for or 2 SNPs per cM using MT. Conclusion: We have established a theoretical basis of how inflated IBD information among dense markers overestimates an NPL statistic. The two-step processing strategy serves as a useful framework to systematically evaluate relative performance of different methods to handle LD.
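The marker thinning (MT) step named in the abstract above can be illustrated with a minimal sketch. It assumes thinning is done purely by map spacing, so that a density cap of 2 SNPs per cM becomes a minimum gap of 0.5 cM between retained markers. The function name `thin_markers` and its parameters are hypothetical, and the paper's actual procedure may differ.

```python
def thin_markers(positions_cm, max_per_cm=2.0):
    """Greedy spacing-based thinning (illustrative assumption, not the paper's code).

    Walks the markers in map order and keeps a marker only if it lies at
    least 1 / max_per_cm centimorgans beyond the last marker kept.
    Returns the indices of the retained markers.
    """
    min_gap = 1.0 / max_per_cm
    order = sorted(range(len(positions_cm)), key=lambda i: positions_cm[i])
    kept = []
    for idx in order:
        if not kept or positions_cm[idx] - positions_cm[kept[-1]] >= min_gap:
            kept.append(idx)
    return kept


# Toy genetic map positions in cM (made-up values for illustration).
positions = [0.0, 0.1, 0.3, 0.5, 0.9, 1.0, 1.6]
kept_idx = thin_markers(positions, max_per_cm=2.0)
print([positions[i] for i in kept_idx])
```

A greedy pass like this guarantees the spacing constraint but does not look at pairwise r² between markers, which is the kind of criterion an LD-threshold-based filter would add.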

    Influence of genotyping error in linkage mapping for complex traits – an analytic study

    Background: Despite the current trend towards large epidemiological studies of unrelated individuals, linkage studies in families are still widely used as tools for disease gene mapping. The use of single-nucleotide polymorphism (SNP) array technology in genotyping of family data has the potential to provide more informative linkage data. Nevertheless, SNP array data are not immune to genotyping error which, as has been suggested in the past, could dramatically affect the evidence for linkage, especially in selective designs such as affected sib pair (ASP) designs. The influence of genotyping error on selective designs for continuous traits has not been assessed yet. Results: We use the identity-by-descent (IBD) regression-based paradigm for linkage testing to analytically quantify the effect of simple genotyping error models under specific selection schemes for sibling pairs. We show, for example, that in extremely concordant (EC) designs, genotyping error leads to decreased power whereas it leads to increased type I error in extremely discordant (ED) designs. Perhaps surprisingly, the effect of genotyping error on inference is most severe in designs where selection is least extreme. We suggest a genomic control for genotyping errors via a simple modification of the intercept in the regression for linkage. Conclusion: This study extends earlier findings: genotyping error can substantially affect type I error and power in selective designs for continuous traits. Designs involving both EC and ED sib pairs are fairly immune to genotyping error. When those designs are not feasible, the simple genomic control strategy that we suggest offers the potential to deliver more robust inference, especially if genotyping is carried out by SNP array technology.

    Visualization of Shared Genomic Regions and Meiotic Recombination in High-Density SNP Data

    A fundamental goal of single nucleotide polymorphism (SNP) genotyping is to determine the sharing of alleles between individuals across genomic loci. Such analyses have diverse applications in defining the relatedness of individuals (including unexpected relationships in nominally unrelated individuals, or consanguinity within pedigrees), analyzing meiotic crossovers, and identifying a broad range of chromosomal anomalies such as hemizygous deletions and uniparental disomy, and analyzing population structure. We present SNPduo, a command-line and web accessible tool for analyzing and visualizing the relatedness of any two individuals using identity by state. Using identity by state does not require prior knowledge of allele frequencies or pedigree information, and is more computationally tractable and is less affected by population stratification than calculating identity by descent probabilities. The web implementation visualizes shared genomic regions, and generates UCSC viewable tracks. The command-line version requires pedigree information for compatibility with existing software and determining specified relationships even though pedigrees are not required for IBS calculation, generates no visual output, is written in portable C++, and is well-suited to analyzing large datasets. We demonstrate how the SNPduo web tool identifies meiotic crossover positions in siblings, and confirm our findings by visualizing meiotic recombination in synthetic three-generation pedigrees. We applied SNPduo to 210 nominally unrelated Phase I / II HapMap samples and, consistent with previous findings, identified six undeclared pairs of related individuals. We further analyzed identity by state in 2,883 individuals from multiplex families with autism and identified a series of anomalies including related parents, an individual with mosaic loss of chromosome 18, an individual with maternal heterodisomy of chromosome 16, and unexplained replicate samples. SNPduo provides the ability to explore and visualize SNP data to characterize the relatedness between individuals. It is compatible with, but distinct from, other established analysis software such as PLINK, and performs favorably in benchmarking studies for the analyses of genetic relatedness.
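Identity by state between two individuals, as used in the abstract above, can be computed per SNP without allele frequencies or pedigrees. The following is a minimal sketch, not SNPduo's implementation. It assumes biallelic genotypes coded as the count (0, 1 or 2) of one designated allele, with `None` for a missing call, and the function names are hypothetical.

```python
def ibs_states(genotypes_a, genotypes_b):
    """Per-SNP identity-by-state (0, 1 or 2 alleles shared) for two individuals.

    Genotypes are counts of one designated allele (0, 1 or 2); None marks a
    missing call, which yields None for that SNP. This coding is an
    assumption of the sketch.
    """
    states = []
    for ga, gb in zip(genotypes_a, genotypes_b):
        if ga is None or gb is None:
            states.append(None)
        else:
            # Same genotype -> 2, one homozygote vs heterozygote -> 1,
            # opposite homozygotes -> 0.
            states.append(2 - abs(ga - gb))
    return states


def mean_ibs(states):
    """Average IBS over SNPs where both individuals were called."""
    called = [s for s in states if s is not None]
    return sum(called) / len(called) if called else float("nan")


# Toy genotype vectors (made-up values for illustration).
person_a = [0, 1, 2, 2, None, 1]
person_b = [0, 1, 0, 1, 2, 1]
states = ibs_states(person_a, person_b)
print(states, mean_ibs(states))
```

Long runs of IBS 2 along a chromosome, or the complete absence of IBS 0 calls, are the kind of pattern a pairwise visualization would surface. The mean alone is only a coarse summary.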

    Parameter Estimation and Quantitative Parametric Linkage Analysis with GENEHUNTER-QMOD

    Objective: We present a parametric method for linkage analysis of quantitative phenotypes. The method provides a test for linkage as well as an estimate of different phenotype parameters. We have implemented our new method in the program GENEHUNTER-QMOD and evaluated its properties by performing simulations. Methods: The phenotype is modeled as a normally distributed variable, with a separate distribution for each genotype. Parameter estimates are obtained by maximizing the LOD score over the normal distribution parameters with a gradient-based optimization method called PGRAD. Results: The PGRAD method has lower power to detect linkage than the variance components analysis (VCA) in the case of a normal distribution and small pedigrees. However, it outperforms the VCA and Haseman-Elston regression for extended pedigrees, nonrandomly ascertained data and non-normally distributed phenotypes. Here, the higher power even goes along with conservativeness, while the VCA has an inflated type I error. Parameter estimation tends to underestimate residual variances but performs better for expectation values of the phenotype distributions. Conclusion: With GENEHUNTER-QMOD, a powerful new tool is provided to explicitly model quantitative phenotypes in the context of linkage analysis. It is freely available at http://www.helmholtz-muenchen.de/genepi/downloads. Copyright (C) 2012 S. Karger AG, Basel

    phenosim - A software to simulate phenotypes for testing in genome-wide association studies

    Background: There is great interest in understanding the genetic architecture of complex traits in natural populations. Genome-wide association studies (GWAS) are becoming routine in human, animal and plant genetics to understand the connection between naturally occurring genotypic and phenotypic variation. Coalescent simulations are commonly used in population genetics to simulate genotypes under different parameters and demographic models. Results: Here, we present phenosim, a software tool to add a phenotype to genotypes generated in time-efficient coalescent simulations. Both qualitative and quantitative phenotypes can be generated and it is possible to partition phenotypic variation between additive effects and epistatic interactions between causal variants. The output formats of phenosim are directly usable as input for different GWAS tools. The applicability of phenosim is shown by simulating a genome-wide association study in Arabidopsis thaliana. Conclusions: By using the coalescent approach to generate genotypes and phenosim to add phenotypes, the data sets can be used to assess the influence of various factors such as demography, genetic architecture or selection on the statistical power of association methods to detect causal genetic variants under a wide variety of population genetic scenarios. phenosim is freely available from the authors' website http://evoplant.uni-hohenheim.de

    QTLs for height: results of a full genome scan in Dutch sibling pairs.

    Height is a highly heritable, complex trait. At present, the genes responsible for the variation in height have not yet been identified. This paper summarizes the results of previous linkage studies and presents results of an additional linkage analysis. Using data from the Netherlands Twin Register, a sib-pair-based linkage analysis for adult height was conducted. For 513 sib-pairs from 174 families, complete genome scans and adult height were available. The strongest evidence for linkage was found for a region on chromosome 6, near markers D6S1053 and D6S1031 (LOD = 2.32). This replicated previous findings in other data sets. LOD scores ranging from 1.53 to 2.04 were found for regions on chromosomes 1, 5, 8, 10, and 18. The region on chromosome 18 (LOD = 1.83) also corresponded with the results of previous studies. Several chromosomal regions are now implicated in the variance in height, but further study is needed to draw definite conclusions with regard to the significance of these regions for adult height.

    Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma

    Asthma is caused by a combination of poorly understood genetic and environmental factors (1,2). We have systematically mapped the effects of single nucleotide polymorphisms (SNPs) on the presence of childhood onset asthma by genome-wide association. We characterized more than 317,000 SNPs in DNA from 994 patients with childhood onset asthma and 1,243 non-asthmatics, using family and case-referent panels. Here we show multiple markers on chromosome 17q21 to be strongly and reproducibly associated with childhood onset asthma in family and case-referent panels with a combined P value of P < 10^-12. In independent replication studies the 17q21 locus showed strong association with diagnosis of childhood asthma in 2,320 subjects from a cohort of German children (P = 0.0003) and in 3,301 subjects from the British 1958 Birth Cohort (P = 0.0005). We systematically evaluated the relationships between markers of the 17q21 locus and transcript levels of genes in Epstein-Barr virus (EBV)-transformed lymphoblastoid cell lines from children in the asthma family panel used in our association study. The SNPs associated with childhood asthma were consistently and strongly associated (P < 10^-22) in cis with transcript levels of ORMDL3, a member of a gene family that encodes transmembrane proteins anchored in the endoplasmic reticulum (3). The results indicate that genetic variants regulating ORMDL3 expression are determinants of susceptibility to childhood asthma. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62682/1/nature06014.pd

    Power calculations using exact data simulation: A useful tool for genetic study designs.

    Statistical power calculations constitute an essential first step in the planning of scientific studies. If sufficient summary statistics are available, power calculations are in principle straightforward and computationally light. In designs that comprise distinct groups (e.g., MZ & DZ twins), sufficient statistics can be calculated within each group and analyzed in a multi-group model. However, when the number of possible groups is prohibitively large (say, in the hundreds), power calculations on the basis of the summary statistics become impractical. In that case, researchers may resort to Monte Carlo based power studies, which involve the simulation of hundreds or thousands of replicate samples for each specified set of population parameters. Here we present exact data simulation as a third method of power calculation. Exact data simulation involves a transformation of raw data so that the data fit the hypothesized model exactly. As in power calculation with summary statistics, exact data simulation is computationally light, while the number of groups in the analysis has little bearing on the practicality of the method. The method is applied to three genetic designs for illustrative purposes.
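The idea of transforming raw data so that they fit a hypothesized model exactly can be sketched for the simplest case of two standardized variables with a target correlation. This is an illustrative assumption rather than the authors' procedure, which may handle general covariance structures. The function names and the bivariate setup are hypothetical.

```python
import math
import random


def _standardize(values):
    """Rescale to sample mean exactly 0 and sample variance (n - 1) exactly 1."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    sd = math.sqrt(sum(c * c for c in centered) / (n - 1))
    return [c / sd for c in centered]


def sample_corr(x, y):
    """Pearson correlation of two equal-length samples."""
    zx, zy = _standardize(x), _standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def exact_bivariate(n, rho, seed=0):
    """Draw n random pairs, then transform them so the SAMPLE correlation is rho."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z1 = _standardize(x)
    # Remove from y whatever is linearly predictable from z1, so that the
    # residual is exactly uncorrelated with z1 in this sample.
    slope = sum(a * b for a, b in zip(z1, y)) / sum(a * a for a in z1)
    z2 = _standardize([b - slope * a for a, b in zip(z1, y)])
    # Recombine the two orthogonal components to hit the target correlation.
    w = math.sqrt(1.0 - rho * rho)
    return z1, [rho * a + w * b for a, b in zip(z1, z2)]


x, y = exact_bivariate(n=50, rho=0.3)
print(round(sample_corr(x, y), 10))
```

Because the sample statistics match the population values by construction, a single transformed data set can stand in for the summary statistics when fitting the model, which is what keeps the approach computationally light compared with Monte Carlo replication.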